Approximations based on random Fourier features have recently emerged as an efficient and formally consistent methodology to design large-scale kernel machines. By expressing the kernel as a Fourier expansion, features are generated based on a finite set of random basis projections, sampled from the Fourier transform of the kernel, with inner products that are Monte Carlo approximations of the original kernel. Based on the observation that different kernel-induced Fourier sampling distributions correspond to different kernel parameters, we show that an optimization process in the Fourier domain can be used to identify the different frequency bands that are useful for prediction on training data. Moreover, the application of group Lasso to random feature vectors corresponding to a linear combination of multiple kernels leads to efficient and scalable reformulations of the standard multiple kernel learning model \cite{Varma09}. In this paper we develop the linear Fourier approximation methodology for both single and multiple gradient-based kernel learning and show that it produces fast and accurate predictors on a complex dataset such as the Visual Object Challenge 2011 (VOC2011).
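To make the random Fourier feature construction described above concrete, the following is a minimal Python sketch of the standard approximation for a Gaussian kernel: frequencies are drawn from the Fourier transform of the kernel, and inner products of the resulting features are Monte Carlo estimates of the kernel value. The function name, the bandwidth parameter gamma, and the number of features are illustrative choices, not the paper's implementation.

```python
import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=None):
    """Map X (n_samples, d) to random Fourier features approximating
    the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).
    Illustrative sketch; gamma and n_features are assumed parameters."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are sampled from the Fourier transform of the kernel;
    # for this Gaussian kernel that is a Gaussian with scale sqrt(2 * gamma).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    # Inner products of these feature vectors are Monte Carlo
    # approximations of the original kernel.
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: compare the approximation against the exact kernel value.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 10))
gamma = 0.5
Z = random_fourier_features(np.vstack([x, y]), n_features=2000, gamma=gamma, seed=1)
approx = Z[0] @ Z[1]
exact = np.exp(-gamma * np.sum((x - y) ** 2))
print(approx, exact)  # the two values should be close
```

In this setting, learning kernel parameters amounts to adjusting the sampling distribution of the frequencies W, which is what makes gradient-based optimization in the Fourier domain possible.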